Alinha-PB: A Phonetic Aligner for Brazilian Portuguese

Abstract

Phonetic alignment is the task of finding the limits of phones and higher units in an audio file. This has been done reliably for many languages, such as English, French and German, but, so far, no available Brazilian Portuguese aligner has had a performance comparable with the ones used for these other languages. Thus, the main goal of this work was to implement a useful tool for forced phonetic alignment in Brazilian Portuguese. The implementation was done in two steps: the grapheme-to-phoneme conversion and the forced alignment itself. The Converter is responsible for receiving an input transcription in graphemes and converting it to its equivalent in phonemes and allophones. It was implemented using computational rules derived from the analysis of regular grapheme-phoneme relations in Portuguese, together with an exception dictionary for the words to which the rules could not be applied. The Aligner is responsible for aligning the phonemes/allophones from the previous module to the corresponding acoustic intervals in the audio file, called "phones". It was implemented using hidden Markov models. Results for the Converter show an accuracy of over 99%, where the mistakes involved the mid vowels /e/ /ɛ/ and /o/ /ɔ/. As for the Aligner, the best model had 87% of alignments with errors below 25 ms.
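The 25 ms figure reported in the abstract is a tolerance-based score: the share of predicted phone boundaries that fall within a fixed distance of a reference boundary. The abstract does not state the exact evaluation protocol, so the following is only a minimal sketch under assumptions: boundaries are given in seconds, reference and predicted boundaries are already paired one-to-one, and the function name and sample values are hypothetical rather than taken from the paper.

```python
def fraction_within_tolerance(reference, predicted, tolerance_s=0.025):
    """Return the share of predicted boundaries whose absolute error
    against the paired reference boundary is below tolerance_s seconds.

    Assumption: both sequences list the same boundaries in the same order.
    """
    if len(reference) != len(predicted):
        raise ValueError("reference and predicted must have the same length")
    if not reference:
        raise ValueError("at least one boundary is required")
    hits = sum(
        1 for ref, pred in zip(reference, predicted)
        if abs(ref - pred) < tolerance_s
    )
    return hits / len(reference)


# Hypothetical boundary times in seconds, for illustration only.
reference_boundaries = [0.10, 0.25, 0.40, 0.62]
predicted_boundaries = [0.11, 0.29, 0.41, 0.60]

score = fraction_within_tolerance(reference_boundaries, predicted_boundaries)
print(f"{score:.0%} of boundaries are within 25 ms")
```

In this made-up example, three of the four boundaries are off by less than 25 ms and one is off by 40 ms, so the score is 75%.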

Similar resources

Towards a Phonetic Brazilian Portuguese Spell Checker

Spell checking is no longer considered a big challenge for natural language processing, at least regarding the task of correcting documents during editing. Nevertheless, without human interaction, it is necessary to automatically choose the word that will most likely correct the misspelled word. Also, there is a further difficulty for spell checking: new types of errors on the web material have...

UNITEX-PB, a set of flexible language resources for Brazilian Portuguese

This work documents the project and development of various computational linguistic resources that support the Brazilian Portuguese language according to the formal methodology used by the corpus processing system called UNITEX. The delivered resources include computational lexicons, libraries to access compressed lexicons, and additional tools to validate those resources.

Voice quality analysis from a phonetic perspective: Voice Profile Analysis Scheme (VPAS) Profile for Brazilian Portuguese

The present study aimed at presenting the instructional material developed in the Brazilian Portuguese context to apply the Voice Profile Analysis Scheme-VPAS (PB-VPAS) for the perceptual evaluation of voice quality and at reporting preliminary data analyzed from a group of six judges who attended a workshop on VPAS. The adaptation of the VPAS into Brazilian Portuguese was accomplished and the ...

Evaluating the LIHLA lexical aligner on Spanish, Brazilian Portuguese and Basque parallel texts

Alignment of words and multiword units plays an important role in many natural language processing applications, such as example-based machine translation, transfer rule learning for machine translation, bilingual lexicography, word sense disambiguation, etc. In this paper we describe LIHLA, a lexical aligner which uses bilingual probabilistic lexicons generated by a freely available set of too...

A Normalizer for UGC in Brazilian Portuguese

User-generated contents (UGC) represent an important source of information for governments, companies, political candidates and consumers. However, most of the Natural Language Processing tools and techniques are developed from and for texts of standard language, and UGC is a type of text especially full of creativity and idiosyncrasies, which represents noise for NLP purposes. This paper prese...

Journal

Journal title: Journal of Communication and Information Systems

Year: 2021

ISSN: 1980-6604, 1980-6612

DOI: https://doi.org/10.14209/jcis.2021.21